All Questions
Tagged with scipy scikit-learn
16 questions
1 vote
1 answer
35 views
scipy bootstrap generates input with inconsistent numbers of samples
I have a dataset of 77 samples, and I am using scipy bootstrap to get a confidence interval to estimate the precision. I am baffled to see that it generates input variables with inconsistent numbers ...
0 votes
0 answers
30 views
Agglomerative clustering with min and max cluster size constraints
Are there any Python packages with agglomerative clustering algorithms that have min and max cluster size constraints built in? I found a great package called KMeansConstrained but unfortunately ...
0 votes
2 answers
908 views
Solve a non-linear system in Python with the Gauss-Newton algorithm (Jacobian matrix J, etc.)
I would like to solve a non-linear system (which contains the goals of a football team in previous matches) using the Gauss-Newton algorithm, in order to find the parameter (of frequency) to use as ...
3 votes
2 answers
2k views
Incremental clustering algorithm
I am looking for an incremental clustering algorithm. By incremental I mean an algorithm that builds clusters starting from an initial dataset and that is able to progressively ingest new items/...
0 votes
1 answer
183 views
Optimize a non-linear function in Python
I am trying to optimize a function using scipy.optimize, but it does not converge. I have a trading strategy with a default stop-loss based on the lowest price over 20 days. I want to optimize this ...
1 vote
0 answers
19 views
Similarity between binary vectors with hierarchical structure
I have a dataset of binary vectors, where each vector is composed of several smaller vectors, each coming from a different parent category. Each of those categories has a different size, e.g. ...
2 votes
0 answers
35 views
What would be a good randomization environment for data science?
I would like to know if there are any best practices for optimizing a randomization environment. Currently I use this simple structure in my config: ...
3 votes
1 answer
690 views
Smaller alternatives to sklearn that don't require scipy?
I am packaging my model for deployment in AWS Lambda, which has a size limit of 250 MB for all dependencies. Sklearn, if you include its dependencies numpy and scipy, is a huge package. Are there any ...
3 votes
2 answers
164 views
How to interpret ANOVA results?
I am trying to identify which attributes in my dataset are not relevant so that I can remove them before fitting a classifier. The target is a categorical variable with three different values. I also have a lot ...
2 votes
1 answer
5k views
How do I force specified coefficients in a Linear Regression model to be positive?
Looking for a way to do this in Python. scipy.optimize.nnls forces all coefficients to be non-negative. Some additional context: I have a data frame with some explanatory variables and a response ...
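One possible approach, sketched here under assumptions rather than taken from the question or its answer, is `scipy.optimize.lsq_linear`, which accepts a lower and upper bound per coefficient. The data, the choice of which coefficients to constrain, and the absence of an intercept are all made up for illustration.

```python
import numpy as np
from scipy.optimize import lsq_linear

# Synthetic data (illustrative only): three explanatory variables, no intercept.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
true_coef = np.array([1.5, -2.0, -0.5])
y = X @ true_coef + 0.1 * rng.normal(size=200)

# Constrain coefficients 0 and 2 to be non-negative; leave coefficient 1 free.
lower = np.array([0.0, -np.inf, 0.0])
upper = np.array([np.inf, np.inf, np.inf])

result = lsq_linear(X, y, bounds=(lower, upper))
coef = result.x
print(coef)
```

Because the unconstrained estimate for coefficient 2 would be negative in this toy data, the bounded fit pushes it to (approximately) zero, while coefficient 1 remains free to take a negative value.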
1 vote
1 answer
3k views
How to measure the correlation between categorical variables and a continuous variable
I have the following list of the names of the categorical variables in my dataset: ...
1 vote
1 answer
3k views
What does sklearn's pairwise_distances with metric='correlation' do?
I've put different values into this function and observed the output. But I can't find a predictable pattern in what is being output. Then I tried digging through the function itself, but its ...
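As a hedged illustration of the metric itself: SciPy defines the "correlation" distance between two vectors as one minus their Pearson correlation coefficient. The sketch below checks that definition with `scipy.spatial.distance.cdist`; it assumes that sklearn's `pairwise_distances` follows the same definition for `metric='correlation'`, and the small array `X` is made up for the example.

```python
import numpy as np
from scipy.spatial.distance import cdist

# Three example rows (illustrative data): row 2 is row 0 reversed,
# so those two rows are perfectly anti-correlated.
X = np.array([
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.5],
    [4.0, 3.0, 2.0, 1.0],
])

# Pairwise correlation distance between all rows.
D = cdist(X, X, metric="correlation")

# The same quantity computed by hand: 1 - Pearson correlation between rows.
manual = 1.0 - np.corrcoef(X)

print(np.round(D, 4))
```

Under this definition the distance ranges from 0 (perfectly correlated rows) to 2 (perfectly anti-correlated rows), independent of each row's scale or offset.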
1 vote
1 answer
3k views
Machine learning with sklearn vs. scipy stats
I've created 50 random x and y points (with slope of y = 2x-1). First, I used Linear Regression from sklearn to fit the model onto my dataset where I got a slope of ...
2 votes
1 answer
32 views
Normally distribute occurrence or counts
I am creating a mock of sales data. One of the columns is salesperson_id where each id can occur more than once (a salesperson can have multiple sales). I want to ...
0 votes
1 answer
44 views
Restricting a weight vector (optimization parameter) to be in a certain domain using a Python ML library linear regression model
Sorry if the title is a bit long, but basically I'm trying to predict values $\hat{y}_i \in [-1,1]$ using a simple model, i.e. something like OLS or ridge regression. I'd like to know if anyone ...